Mock exam 2018 - solutions

Question 1

(a) Definition of a normal form game

An $N$ player normal form game consists of:

  • A finite set of $N$ players
  • Strategy spaces for the players: $\{S_1,S_2,S_3,\dots,S_N\}$;
  • Payoff functions for the players: $u_i:S_1\times S_2\dots\times S_N\to\mathbb{R}$

[2]

(b) Identifying best responses

We have:

  • $u_r((1, 0), (y, 1-y))= y - 2 (1 - y)=3y-2$ and $u_r((0, 1), (y, 1-y))= -2y + (1 - y)=1-3y$
  • $u_c((x, 1 - x), (1, 0))=-2x+(1-x)=1-3x$ and $u_c((x, 1- x), (0, 1))=x-2(1-x)=3x-2$

In [1]:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.figure()
ys = [0, 1]
plt.plot(ys, [3 * y - 2 for y in ys], label="$u_r((1, 0), (y, 1-y))$")
plt.plot(ys, [1 - 3 * y for y in ys], label="$u_r((0, 1), (y, 1-y))$")
plt.legend()
plt.xlabel("$y$")
plt.title("Utility to row player");


[1]


In [2]:
plt.figure()
xs = [0, 1]
plt.plot(xs, [1 - x * 3 for x in xs], label="$u_c((x, 1 - x), (1, 0))$")
plt.plot(xs, [3 * x - 2 for x in xs], label="$u_c((x, 1 - x), (0, 1))$")
plt.legend()
plt.xlabel("$x$")
plt.title("Utility to column player");


[1]

The best responses are then given by:

$$ \sigma_r^* = \begin{cases} (0,1)&\text{ if }y< 1/2\\ (1,0)&\text{ if }y> 1/2\\ \text{indifferent}&\text{ otherwise} \end{cases} $$

$$ \sigma_c^* = \begin{cases} (1,0)&\text{ if }x< 1/2\\ (0,1)&\text{ if }x> 1/2\\ \text{indifferent}&\text{ otherwise} \end{cases} $$

[1]
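These best response regions can be checked numerically. A small sketch, assuming the row player's payoff matrix is reconstructed from the utilities in (b) (the exam paper itself states the game):

```python
import numpy as np

# Payoff matrix reconstructed from the utilities in (b) (an assumption;
# the exam paper itself states the game)
A = np.array([[1, -2], [-2, 1]])

def row_best_responses(y):
    """Indices of the row player's pure best responses to (y, 1 - y)."""
    utilities = A @ np.array([y, 1 - y])
    return list(np.argwhere(utilities == utilities.max()).flatten())

print(row_best_responses(0.25))  # [1]: play (0, 1) when y < 1/2
print(row_best_responses(0.75))  # [0]: play (1, 0) when y > 1/2
print(row_best_responses(0.5))   # [0, 1]: indifferent at y = 1/2
```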

(c) Proof of best response condition

$(A\sigma_c^T)_i$ is the utility of the row player when they play their $i$th strategy. Thus:

$$\sigma_rA\sigma_c^T=\sum_{i=1}^{m}{\sigma_r}_i(A\sigma_c^T)_i$$

[2]

Let $u=\max_{k}(A\sigma_c^T)_k$. Thus:

$$ \begin{aligned} \sigma_rA\sigma_c^T&=\sum_{i=1}^{m}{\sigma_r}_i(u - u + (A\sigma_c^T)_i)\\ &=\sum_{i=1}^{m}{\sigma_r}_iu - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\\ &=u - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i) \end{aligned}$$

[2]

We know that $u - (A\sigma_c^T)_i\geq 0$, thus the largest $\sigma_rA\sigma_c^T$ can be is $u$ which occurs iff ${\sigma_r}_i > 0 \Rightarrow (A\sigma_c^T)_i = u$ as required.

[1]
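A quick numerical illustration of this condition, again assuming the matrix reconstructed from the utilities in (b): a strategy attains the maximum payoff $u$ if and only if it only puts weight on maximising rows.

```python
import numpy as np

# Matrix reconstructed from the utilities in (b) (an assumption)
A = np.array([[1, -2], [-2, 1]])
sigma_c = np.array([3 / 4, 1 / 4])
u = (A @ sigma_c).max()  # the best achievable payoff against sigma_c

# A strategy supported only on maximising rows attains u ...
print(np.array([1, 0]) @ A @ sigma_c)
# ... whereas any weight on a non-maximising row strictly lowers the payoff
print(np.array([1 / 2, 1 / 2]) @ A @ sigma_c)
```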

(d) Finding the Nash equilibrium

There are no pairs of pure strategies that are best responses to each other (visible from the payoff matrices). We consider mixed strategies $\sigma_r=(x, 1- x)$ and $\sigma_c=(y, 1-y)$ [1]

For $\sigma_r$ to be a best response to $\sigma_c$ (and similarly for $\sigma_c$), from the theorem we have:

$$u_r((1, 0), \sigma_c)=u_r((0, 1), \sigma_c)\Rightarrow 3y-2=1-3y\Rightarrow y=1/2$$

$$u_c(\sigma_r, (1, 0))=u_c(\sigma_r, (0, 1))\Rightarrow 1-3x=3x-2\Rightarrow x=1/2$$

[2]

The Nash equilibrium is $((1/2, 1/2), (1/2, 1/2))$.

This is confirmed by the best response findings of question (b): these two strategies are best responses to each other. [1]
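The indifference equations can also be solved numerically. A minimal sketch, with the matrices reconstructed from the utilities in (b) (an assumption):

```python
import numpy as np

# Matrices reconstructed from the utilities in (b) (an assumption)
A = np.array([[1, -2], [-2, 1]])
B = np.array([[-2, 1], [1, -2]])

# Row player indifferent between rows: A[0] @ (y, 1-y) = A[1] @ (y, 1-y)
y = (A[1, 1] - A[0, 1]) / (A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1])
# Column player indifferent between columns, using B similarly
x = (B[1, 1] - B[1, 0]) / (B[0, 0] - B[0, 1] - B[1, 0] + B[1, 1])
print(x, y)  # 0.5 0.5
```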

(e) Discussing the Knight et al. 2017 paper

  • (i): This paper builds a Markov queueing model based on strategic interactions between two hospitals [1]. It measures the Price of Anarchy to see the effect on patients of rational behaviour [1]. It does all this by comparing the Nash equilibrium to the optimal behaviour [1].
  • (ii): The main theoretical result concerns the behaviour of the best response functions. [1] The paper proves that they are increasing functions, which ensures there is a single point of intersection. [1]
  • (iii): Various possibilities each worth [2]:
    • A stationary queueing model is used, whilst in practice a time dependent model would be more appropriate.
    • Only considering interaction between two hospitals.
    • Just a single cutoff strategy.
  • (iv): Depending on the choice of limiting factor [3]:
    • Use a non stationary queueing model, potentially simulating the queueing process in a time dependent way.
    • Consider a multi player game, although this would make the theoretical result harder to generalise.
    • Use multiple cutoff strategies (this would be the simplest approach).

Question 2

(a) Definition of repeated game

Given a two player game $(A,B)\in{\mathbb{R}^{m\times n}}^2$, referred to as a stage game, a $T$-stage repeated game is a game in which players play that stage game for $T > 0$ periods. Players make decisions based on the full history of play over all the periods.

[2]

(b) Definition of a strategy in a repeated game

Given a two player stage game $(A,B)\in{\mathbb{R}^{m\times n}}^2$ repeated to give a $T$-stage repeated game, a strategy is a mapping from the entire history of play to an action of the stage game:

$$ \bigcup_{t=0}^{T-1}H(t)\to a $$

where:

  • $H(t)$ is the history of play of both players up until stage $t$ ($H(0)=(\emptyset, \emptyset)$)
  • $a$ is an action (for either player) of the stage game

[2]

(c) Size of history space

For a given integer value of $0\leq t< T$ there are $|S_1|^t$ possible histories for player 1 and $|S_2|^t$ possible histories for player 2. Thus the total number of histories is: [2]

$$\sum_{t=0}^{T - 1}|S_1|^t|S_2|^t$$

[1]

which gives:

$$\sum_{t=0}^{T - 1}|S_1|^t|S_2|^t=\sum_{t=0}^{T - 1}(|S_1||S_2|)^t=\frac{1 - (|S_1||S_2|)^T}{1 - |S_1||S_2|}$$

[2]
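The closed form can be checked by brute force enumeration. A small sketch (`number_of_histories` is a hypothetical helper, not from the question):

```python
import itertools

def number_of_histories(s1, s2, T):
    """Brute force count of all histories (pairs of action
    sequences) of length t for each 0 <= t < T."""
    return sum(
        len(list(itertools.product(
            itertools.product(range(s1), repeat=t),
            itertools.product(range(s2), repeat=t),
        )))
        for t in range(T)
    )

# Matches the geometric sum (1 - (s1 * s2) ** T) / (1 - s1 * s2)
s1, s2, T = 2, 3, 2
closed_form = (1 - (s1 * s2) ** T) / (1 - s1 * s2)
print(number_of_histories(s1, s2, T), closed_form)  # 7 7.0
```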

(d) Listing strategy spaces

$S_1=\{r_1, r_2\}$, $S_2=\{c_1, c_2, c_3\}$ and $T=2$

$$\{(\emptyset, \emptyset), (r_1, c_1), (r_1, c_2), (r_1, c_3), (r_2, c_1), (r_2, c_2), (r_2, c_3)\}$$

[3]
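This set can be generated programmatically. A sketch (representing the empty history as a pair of empty tuples is an assumption of the sketch):

```python
import itertools

# History space for S1 = {r1, r2}, S2 = {c1, c2, c3} and T = 2
S1, S2 = ("r1", "r2"), ("c1", "c2", "c3")
histories = [((), ())]  # the empty history at t = 0
histories += [((r,), (c,)) for r, c in itertools.product(S1, S2)]
for history in histories:
    print(history)
print(len(histories))  # 7 histories, matching the count from (c)
```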

(e) Sequence of stage Nash theorem

Consider the following strategy:

The row/column player should play action $a_{r/c}$ regardless of the play of any previous strategy profiles. [2]

where $(a_{r}, a_{c})$ is a given stage Nash equilibrium.

Using backwards induction: the profile played at the last stage is a Nash equilibrium of the stage game, so no player has an incentive to deviate there. Given that, the same holds at the $(T-1)$th stage, and so on back to the first stage. The proof follows.

[2]

(f) Pure Nash equilibria

The pure Nash equilibria:

$$A = \begin{pmatrix}\underline{3} & \underline{6} & \underline{1}\\1 & 2 & 0\\\end{pmatrix} \qquad B = \begin{pmatrix}0 & \underline{7} & \underline{7}\\\underline{20} & 1 & 0\\\end{pmatrix}$$

Thus, for our example we have the four Nash equilibria:

  • $(r_1r_1, c_2c_2)$ with utility: (12, 14). [1]
  • $(r_1r_1, c_2c_3)$ with utility: (7, 14). [1]
  • $(r_1r_1, c_3c_2)$ with utility: (7, 14). [1]
  • $(r_1r_1, c_3c_3)$ with utility: (2, 14). [1]
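These utilities can be checked directly from the payoff matrices. A small sketch (indexing the actions as $r_1=0$, $c_2=1$, $c_3=2$ is an assumption of the sketch):

```python
import numpy as np

A = np.array([[3, 6, 1], [1, 2, 0]])
B = np.array([[0, 7, 7], [20, 1, 0]])

def repeated_utility(row_plays, col_plays):
    """Sum the stage payoffs along a sequence of action profiles."""
    payoffs = [(A[r, c], B[r, c]) for r, c in zip(row_plays, col_plays)]
    return tuple(int(sum(u)) for u in zip(*payoffs))

# The four sequences of stage Nash equilibria (r1 = 0, c2 = 1, c3 = 2)
for cols in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    print(repeated_utility((0, 0), cols))
```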

(g) Non stage Nash equilibria

Consider the following two strategies:

  1. For the row player:

    $$(\emptyset, \emptyset) \to r_2$$ $$(r_2, c_1) \to r_1$$ $$(r_2, c_2) \to r_1$$ $$(r_2, c_3) \to r_1$$

    [2]

  2. For the column player:

    $$(\emptyset, \emptyset) \to c_1$$ $$(r_1, c_1) \to c_3$$ $$(r_2, c_1) \to c_2$$

    [2]

This is a Nash equilibrium because:

  1. If the row player deviates, it would only be rational to do so in the first stage; they would gain 2 in that stage but lose 5 in the second stage. Thus they have no incentive to deviate.
  2. If the column player deviates, they would only do so in the first stage and gain no utility.

[2]
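The deviation payoffs can be verified from the payoff matrices. A sketch (using the indexing $r_1=0, r_2=1$ and $c_1=0, c_2=1, c_3=2$, an assumption of the sketch):

```python
import numpy as np

A = np.array([[3, 6, 1], [1, 2, 0]])
B = np.array([[0, 7, 7], [20, 1, 0]])

# Payoffs along the equilibrium path: (r2, c1) then (r1, c2)
on_path = (int(A[1, 0] + A[0, 1]), int(B[1, 0] + B[0, 1]))

# If the row player deviates to r1 in the first stage, the column player's
# strategy responds with c3 in the second stage: (r1, c1) then (r1, c3)
row_deviates = (int(A[0, 0] + A[0, 2]), int(B[0, 0] + B[0, 2]))

print(on_path, row_deviates)  # (7, 27) (4, 7): deviating loses 3 overall
```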

Question 3

(a) Defining the row/col best response polytopes

For a two player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the row/column player best response polytope $\mathcal{P}$/$\mathcal{Q}$ is defined by:

[1]

$$ \mathcal{P} = \left\{x\in\mathbb{R}^{m}\;|\;x\geq 0; xB\leq 1\right\} $$

[1]

$$ \mathcal{Q} = \left\{y\in\mathbb{R}^{n}\;|\; Ay\leq 1; y\geq 0 \right\} $$

[1]

(b) Degeneracy of the game

The column player has two best responses to the first row thus the game is degenerate. [1]

(c) Obtain half space definitions:

Directly apply the definition: [1]

Note that any modification of $A, B$ by adding a constant (so as to make all entries $>0$) is acceptable: this does not change the equilibria.

For $\mathcal{P}$ :

$$ \begin{aligned} x_1 &\geq 0\\ x_2 &\geq 0\\ 20x_2 & \leq 1\\ 7 x_1 + x_2 & \leq 1\\ 7 x_1& \leq 1\\ \end{aligned} $$

[2]

For $\mathcal{Q}$ :

$$ \begin{aligned} 3y_1 +6y_2 + y_3 & \leq 1\\ y_1 +2y_2 & \leq 1\\ y_1 &\geq 0\\ y_2 &\geq 0\\ y_3 &\geq 0\\ \end{aligned} $$

[2]

(d) Confirming labelling:

There are infinitely many possible vertices (based on potential modifications of $A, B$). Thus it makes sense to consider the strategies that correspond to the normalised vertices given (recall the question does not ask us to find the vertices):

For $\mathcal{P}$:

  • $x=(0, 0)\text{ with labels: }\{0, 1\}$: $x_1=0$ and $x_2=0$
  • $x=(1/7, 0)\rightarrow \sigma_r=(1, 0)\text{ with labels: }\{1, 3, 4\}$. Label $1$ because $x_2=0$. Labels $\{3, 4\}$ because the 2nd and 3rd columns are best responses.
  • $x=(0, 1/20)\rightarrow \sigma_r=(0, 1)\text{ with labels: }\{0, 2\}$: Label $0$ because $x_1=0$. Label $2$ because the first column is a best response.
  • $x=(19/140, 1/20)\rightarrow \sigma_r=(19/26, 7/26)\text{ with labels: }\{2, 3\}$: we have $\sigma_r B = (70/13, 70/13, 133/26)$ thus columns 1 and 2 are best responses.

[2]

For $\mathcal{Q}$:

  • $y=(0, 0, 0)\text{ with labels: }\{2, 3, 4\}$: $y_1=0$, $y_2=0$ and $y_3=0$
  • $y=(0, 0, 1)\rightarrow \sigma_c=(0, 0, 1)\text{ with labels: }\{0, 2, 3\}$: labels 2, 3 are immediate; label 0 because the first row is a best response to the 3rd column.
  • $y=(0, 1/6, 0)\rightarrow \sigma_c=(0, 1, 0)\text{ with labels: }\{0, 2, 4\}$: labels 2, 4 are immediate; label 0 because the first row is a best response to the 2nd column.
  • $y=(1/3, 0, 0)\rightarrow \sigma_c=(1, 0, 0)\text{ with labels: }\{0, 3, 4\}$: labels 3, 4 are immediate; label 0 because the first row is a best response to the 1st column.

[2]
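The labels can also be verified numerically. A sketch for $\mathcal{P}$ (the label convention follows the solution above: 0, 1 for the row player's strategies, 2, 3, 4 for the columns; `labels` is a hypothetical helper):

```python
import numpy as np

B = np.array([[0, 7, 7], [20, 1, 0]])

def labels(x):
    """Tight constraints of P at x: labels 0, 1 for x_i = 0 and
    labels 2, 3, 4 for binding columns of xB <= 1."""
    tight_x = {i for i in range(2) if np.isclose(x[i], 0)}
    tight_B = {2 + j for j in range(3) if np.isclose((x @ B)[j], 1)}
    return tight_x | tight_B

print(labels(np.array([1 / 7, 0])))          # {1, 3, 4}
print(labels(np.array([0, 1 / 20])))         # {0, 2}
print(labels(np.array([19 / 140, 1 / 20])))  # {2, 3}
```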


In [3]:
import nashpy as nash


A = np.array([[3, 6, 1], 
              [1, 2, 0]])
B = np.array([[0, 7, 7], 
              [20, 1, 0]])

row_halfspaces = nash.polytope.build_halfspaces(B.transpose())
for vertex, labels in nash.polytope.non_trivial_vertices(row_halfspaces):
    print(np.round(vertex / sum(vertex), 4))


[1. 0.]
[0. 1.]
[0.7308 0.2692]

In [4]:
col_halfspaces = nash.polytope.build_halfspaces(A)
for vertex, labels in nash.polytope.non_trivial_vertices(col_halfspaces):
    print(np.round(vertex / sum(vertex), 4))


[0. 0. 1.]
[0. 1. 0.]
[1. 0. 0.]

(e) The vertex enumeration algorithm

For a nondegenerate 2 player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the following algorithm returns all Nash equilibria:

  1. For every pair of vertices of the best response polytopes:
  2. check whether the pair of vertices is fully labelled;
  3. if so, return the normalised probability vectors as a Nash equilibrium.

[2]

(f) Use the vertex enumeration algorithm

The only vertex pairs with a full set of labels:

  • $x=(1/7, 0)\text{ with labels: }\{1, 3, 4\}$ and $y=(0, 0, 1)\text{ with labels: }\{0, 2, 3\}$ [2]
  • $x=(1/7, 0)\text{ with labels: }\{1, 3, 4\}$ and $y=(0, 1/6, 0)\text{ with labels: }\{0, 2, 4\}$ [2]

This corresponds to:

$$\{((1, 0), (0, 0, 1)), ((1, 0), (0, 1, 0))\}$$

[1]


In [5]:
game = nash.Game(A, B)
for eq in game.vertex_enumeration():
    print([np.round(s, 2) for s in eq])


[array([1., 0.]), array([0., 0., 1.])]
[array([1., 0.]), array([0., 1., 0.])]

(g) The L-H Algorithm

For a nondegenerate 2 player game $(A, B)\in{\mathbb{R}^{m\times n}_{>0}}^2$ the following algorithm returns a Nash equilibrium:

  1. Start at the artificial equilibrium: $(0, 0)$
  2. Choose a label to drop.
  3. Remove this label from the corresponding vertex by traversing an edge of the corresponding polytope to another vertex.
  4. The new vertex will now have a duplicate label in the other polytope. Remove this label from the vertex of the other polytope and traverse an edge of that polytope to another vertex.
  5. Repeat step 4 until the pair of vertices is fully labelled.

(h) Using the L-H Algorithm

Note that the game is degenerate but we can still attempt to use the algorithm:

  • $((0, 0), (0, 0, 0))$ have labels: $\{0, 1\}, \{2, 3, 4\}$. Drop 0 (arbitrary decision) in $\mathcal{P}$. [1]
  • $\to ((1/7, 0), (0, 0, 0))$ have labels: $\{1, 3, 4\}, \{2, 3, 4\}$. In $\mathcal{Q}$ drop 3 (arbitrary decision, could have dropped 4). [1]
  • $\to ((1/7, 0), (0, 1/6, 0))$ have labels: $\{1, 3, 4\}, \{0, 2, 4\}$. [1]

We have a fully labelled vertex pair (one of the equilibria found in part (f)).

There are numerous possible runs of this algorithm depending on the choice of dropped labels; here is another:

  • $((0, 0), (0, 0, 0))$ have labels: $\{0, 1\}, \{2, 3, 4\}$. Drop 2 (arbitrary decision) in $\mathcal{Q}$.
  • $\to ((0, 0), (1/3, 0, 0))$ have labels: $\{0, 1\}, \{0, 3, 4\}$. In $\mathcal{P}$ drop 0.
  • $\to ((1/7, 0), (1/3, 0, 0))$ have labels: $\{1, 3, 4\}, \{0, 3, 4\}$. In $\mathcal{Q}$ drop 4 (arbitrary decision, could have dropped 3).
  • $\to ((1/7, 0), (0, 0, 1))$ have labels $\{1, 3, 4\}, \{0, 2, 3\}$

Again we have a fully labelled vertex pair (the other equilibrium found in part (f)).

Question 4

(a) Definition of a Moran process on a game

Consider a matrix $A\in\mathbb{R}^{2\times 2}$ representing a game with two strategies:

$$ A= \begin{pmatrix} a & b\\ c & d \end{pmatrix} $$

The Moran process is as follows:

  • At a given time step: all individuals play all other individuals.
  • Obtain their fitness as given by the game.
  • Randomly select an individual, proportional to their fitness, to be reproduced.
  • Uniformly select an individual to be replaced.
  • Proceed to the next time step.
  • The process terminates when there is only one type of individual in the population.

[4]
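The steps above can be simulated directly. A minimal sketch for a 2x2 game (`simulate_moran` is a hypothetical helper; payoffs are assumed nonnegative so that the selection probabilities are well defined):

```python
import numpy as np

def simulate_moran(game, N, i, repetitions, seed=0):
    """Estimate the fixation probability of the first type, starting
    with i individuals of that type in a population of size N."""
    rng = np.random.default_rng(seed)
    fixations = 0
    for _ in range(repetitions):
        k = i  # current number of individuals of the first type
        while 0 < k < N:
            # Mean payoff of each type against the rest of the population
            f1 = (game[0, 0] * (k - 1) + game[0, 1] * (N - k)) / (N - 1)
            f2 = (game[1, 0] * k + game[1, 1] * (N - k - 1)) / (N - 1)
            # Birth proportional to fitness, death uniformly at random
            born_first = rng.random() < k * f1 / (k * f1 + (N - k) * f2)
            dies_first = rng.random() < k / N
            k += int(born_first) - int(dies_first)
        fixations += k == N
    return fixations / repetitions

# The game of part (c) with r = 2: the estimate should be close to x_1 = 2/5
print(simulate_moran(np.array([[0, 2], [2, 0]]), N=3, i=1, repetitions=5000))
```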

(b) Theorem

Given a birth death process, the fixation probability $x_i$ is given by:

$$x_i=\frac{1+\sum_{j=1}^{i-1}\prod_{k=1}^j\gamma_k}{1+\sum_{j=1}^{N-1}\prod_{k=1}^j\gamma_k}$$

where:

$$ \gamma_k = \frac{p_{k,k-1}}{p_{k,k+1}} $$

[2]

The Moran process is a birth death process where the transition probabilities are then given by:

$$p_{i,i+1}=\frac{if_{1i}}{if_{1i} + (N-i)f_{2i}}\frac{N-i}{N}$$

$$p_{i,i-1}=\frac{(N-i)f_{2i}}{if_{1i} + (N-i)f_{2i}}\frac{i}{N}$$

[2]

which gives:

$$\gamma_i=\frac{f_{2i}}{f_{1i}}$$

[1]

thus (using the general birth death process result):

$$ x_i=\frac{1+\sum_{j=1}^{i-1}\prod_{k=1}^j\frac{f_{2k}}{f_{1k}}}{1+\sum_{j=1}^{N-1}\prod_{k=1}^j\frac{f_{2k}}{f_{1k}}} $$

[1]

(c) Moran process for the game

Assuming $i$ individuals of the first type, for this game we have $N=3$ and $(a, b, c, d)=(0, 2, r, 0)$; the fitness of the two types is given respectively by:

$$f_{1i}=\frac{a(i-1)+b(N-i)}{N-1}=\frac{6-2i}{2}=3-i$$

$$f_{2i}=\frac{ci+d(N-i-1)}{N-1}=\frac{ri}{2}$$

which gives:

$$\gamma_i=\frac{f_{2i}}{f_{1i}}=\frac{ri}{6-2i}$$

thus:

$$ x_1=\frac{1}{1+\sum_{j=1}^{2}\prod_{k=1}^j\frac{rk}{6-2k}}=\frac{1}{1 + \frac{r}{4} + \frac{r}{4}\frac{2r}{2}}=\frac{1}{\frac{r^{2}}{4} + \frac{r}{4} + 1} $$

for $r=2$ we get:

$$ x_1 = 2 / 5 $$

Some code to verify the result:


In [6]:
def theoretic_fixation(N, game, i=1):
    """
    Calculate x_i as given by the above formula
    """
    f_ones = np.array([(game[0, 0] * (k - 1) + game[0, 1] * (N - k)) / (N - 1) for k in range(1, N)])
    f_twos = np.array([(game[1, 0] * k + game[1, 1] * (N - k - 1)) / (N - 1) for k in range(1, N)])
    gammas = f_twos / f_ones
    return (1 + np.sum(np.cumprod(gammas[:i - 1]))) / (1 + np.sum(np.cumprod(gammas)))

In [7]:
import sympy as sym
import numpy as np
r = sym.symbols("r")
game = np.array([[sym.S(0), sym.S(2)], [sym.S(r), sym.S(0)]])
x_1 = theoretic_fixation(N=3, game=game)
x_1, x_1.subs({r: 2})


Out[7]:
(1/(r**2/4 + r/4 + 1), 2/5)

(d) Finding an $r$

Finding $r$ such that $x_1>9/10$ corresponds to:

$$\frac{1}{\frac{r^{2}}{4} + \frac{r}{4} + 1}>9/10$$

which corresponds to:

$$ \begin{aligned} \left(\frac{r^{2}}{4} + \frac{r}{4} + 1\right)9/10 & < 1 &&\text{[1]}\\ \left(\frac{r^{2}}{4} + \frac{r}{4} + 1\right) & < 10/9 && \text{[1]}\\ \frac{r^{2}}{4} + \frac{r}{4} - \frac{1}{9}& < 0 && \text{[1]} \\ r ^ 2 + r - \frac{4}{9}& < 0 && \text{[1]} \\ \end{aligned}$$

The roots of this polynomial are $-4/3$ and $1/3$ [2], thus (since $r>0$): $r<1/3$ ensures $x_1>9/10$. [1]

To ensure a high probability of fixation we need the fitness of an individual of the second type encountering an individual of the first type to be at most $1/3$. [1]


In [8]:
sym.solveset(r ** 2 + r  - sym.S(4) / 9, r)


Out[8]:
{-4/3, 1/3}